这项研究提出了依赖电压突触可塑性(VDSP),这是一种新型的脑启发的无监督的本地学习规则,用于在线实施HEBB对神经形态硬件的可塑性机制。拟议的VDSP学习规则仅更新了突触后神经元的尖峰的突触电导,这使得相对于标准峰值依赖性可塑性(STDP)的更新数量减少了两倍。此更新取决于突触前神经元的膜电位,该神经元很容易作为神经元实现的一部分,因此不需要额外的存储器来存储。此外,该更新还对突触重量进行了正规化,并防止重复刺激时的重量爆炸或消失。进行严格的数学分析以在VDSP和STDP之间达到等效性。为了验证VDSP的系统级性能,我们训练一个单层尖峰神经网络(SNN),以识别手写数字。我们报告85.01 $ \ pm $ 0.76%(平均$ \ pm $ s.d。)对于MNIST数据集中的100个输出神经元网络的精度。在缩放网络大小时,性能会提高(400个输出神经元的89.93 $ \ pm $ 0.41%,500个神经元为90.56 $ \ pm $ 0.27),这验证了大规模计算机视觉任务的拟议学习规则的适用性。有趣的是,学习规则比STDP更好地适应输入信号的频率,并且不需要对超参数进行手动调整。
translated by 谷歌翻译
在最近的方法论文中,我们展示了如何使用当地集合卡尔曼滤波器来学习混沌动力学以及状态轨迹。在这里,我们更系统地调查使用具有协方差定位或本地域的本地集合卡尔曼滤波器的可能性,以便检索状态和密钥全局和本地参数的混合。全局参数旨在代表代理动态核心,例如通过神经网络,这些核心让人想起数据驱动的动态机器学习,而本地参数通常代表模型的强制。针对联合状态和参数估计,提出了一种用于协方差和局域定位的一系列算法。特别是,我们展示了如何使用诸如本地集合变换卡尔曼滤波器(LetkF),这是一个固有的本地方法的本地域集合Kalman滤波器(ENKF)严格更新全局参数。使用几种本地ENKF味道在40变量LORENZ模型上取得了成功测试方法。最终提供基于多层Lorenz模型的二维图示。它使用辐射状的非本地观测。它具有本地域名和协方差本地化,以便学习混沌动态和本地强制。本文始终涉及全局和本地模型参数的在线估计的关键问题。
translated by 谷歌翻译
Partial differential equations (PDEs) are important tools to model physical systems, and including them into machine learning models is an important way of incorporating physical knowledge. Given any system of linear PDEs with constant coefficients, we propose a family of Gaussian process (GP) priors, which we call EPGP, such that all realizations are exact solutions of this system. We apply the Ehrenpreis-Palamodov fundamental principle, which works like a non-linear Fourier transform, to construct GP kernels mirroring standard spectral methods for GPs. Our approach can infer probable solutions of linear PDE systems from any data such as noisy measurements, or initial and boundary conditions. Constructing EPGP-priors is algorithmic, generally applicable, and comes with a sparse version (S-EPGP) that learns the relevant spectral frequencies and works better for big data sets. We demonstrate our approach on three families of systems of PDE, the heat equation, wave equation, and Maxwell's equations, where we improve upon the state of the art in computation time and precision, in some experiments by several orders of magnitude.
translated by 谷歌翻译
Unbiased learning to rank (ULTR) studies the problem of mitigating various biases from implicit user feedback data such as clicks, and has been receiving considerable attention recently. A popular ULTR approach for real-world applications uses a two-tower architecture, where click modeling is factorized into a relevance tower with regular input features, and a bias tower with bias-relevant inputs such as the position of a document. A successful factorization will allow the relevance tower to be exempt from biases. In this work, we identify a critical issue that existing ULTR methods ignored - the bias tower can be confounded with the relevance tower via the underlying true relevance. In particular, the positions were determined by the logging policy, i.e., the previous production model, which would possess relevance information. We give both theoretical analysis and empirical results to show the negative effects on relevance tower due to such a correlation. We then propose three methods to mitigate the negative confounding effects by better disentangling relevance and bias. Empirical results on both controlled public datasets and a large-scale industry dataset show the effectiveness of the proposed approaches.
translated by 谷歌翻译
G-Enum histograms are a new fast and fully automated method for irregular histogram construction. By framing histogram construction as a density estimation problem and its automation as a model selection task, these histograms leverage the Minimum Description Length principle (MDL) to derive two different model selection criteria. Several proven theoretical results about these criteria give insights about their asymptotic behavior and are used to speed up their optimisation. These insights, combined to a greedy search heuristic, are used to construct histograms in linearithmic time rather than the polynomial time incurred by previous works. The capabilities of the proposed MDL density estimation method are illustrated with reference to other fully automated methods in the literature, both on synthetic and large real-world data sets.
translated by 谷歌翻译
Neural Radiance Fields (NeRFs) are emerging as a ubiquitous scene representation that allows for novel view synthesis. Increasingly, NeRFs will be shareable with other people. Before sharing a NeRF, though, it might be desirable to remove personal information or unsightly objects. Such removal is not easily achieved with the current NeRF editing frameworks. We propose a framework to remove objects from a NeRF representation created from an RGB-D sequence. Our NeRF inpainting method leverages recent work in 2D image inpainting and is guided by a user-provided mask. Our algorithm is underpinned by a confidence based view selection procedure. It chooses which of the individual 2D inpainted images to use in the creation of the NeRF, so that the resulting inpainted NeRF is 3D consistent. We show that our method for NeRF editing is effective for synthesizing plausible inpaintings in a multi-view coherent manner. We validate our approach using a new and still-challenging dataset for the task of NeRF inpainting.
translated by 谷歌翻译
Co-clustering is a class of unsupervised data analysis techniques that extract the existing underlying dependency structure between the instances and variables of a data table as homogeneous blocks. Most of those techniques are limited to variables of the same type. In this paper, we propose a mixed data co-clustering method based on a two-step methodology. In the first step, all the variables are binarized according to a number of bins chosen by the analyst, by equal frequency discretization in the numerical case, or keeping the most frequent values in the categorical case. The second step applies a co-clustering to the instances and the binary variables, leading to groups of instances and groups of variable parts. We apply this methodology on several data sets and compare with the results of a Multiple Correspondence Analysis applied to the same data.
translated by 谷歌翻译
Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures in continuous, binary or contingency tables. However, very little work has been done to perform co-clustering on mixed type data. In this article, we extend the latent block models based co-clustering to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and we discuss its advantages and potential limits.
translated by 谷歌翻译
We explore the abilities of two machine learning approaches for no-arbitrage interpolation of European vanilla option prices, which jointly yield the corresponding local volatility surface: a finite dimensional Gaussian process (GP) regression approach under no-arbitrage constraints based on prices, and a neural net (NN) approach with penalization of arbitrages based on implied volatilities. We demonstrate the performance of these approaches relative to the SSVI industry standard. The GP approach is proven arbitrage-free, whereas arbitrages are only penalized under the SSVI and NN approaches. The GP approach obtains the best out-of-sample calibration error and provides uncertainty quantification.The NN approach yields a smoother local volatility and a better backtesting performance, as its training criterion incorporates a local volatility regularization term.
translated by 谷歌翻译
Differentiable Search Indices (DSIs) encode a corpus of documents in the parameters of a model and use the same model to map queries directly to relevant document identifiers. Despite the strong performance of DSI models, deploying them in situations where the corpus changes over time is computationally expensive because reindexing the corpus requires re-training the model. In this work, we introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents. Across different model scales and document identifier representations, we show that continual indexing of new documents leads to considerable forgetting of previously indexed documents. We also hypothesize and verify that the model experiences forgetting events during training, leading to unstable learning. To mitigate these issues, we investigate two approaches. The first focuses on modifying the training dynamics. Flatter minima implicitly alleviate forgetting, so we optimize for flatter loss basins and show that the model stably memorizes more documents (+12\%). Next, we introduce a generative memory to sample pseudo-queries for documents and supplement them during continual indexing to prevent forgetting for the retrieval task. Extensive experiments on novel continual indexing benchmarks based on Natural Questions (NQ) and MS MARCO demonstrate that our proposed solution mitigates forgetting by a significant margin. Concretely, it improves the average Hits@10 by $+21.1\%$ over competitive baselines for NQ and requires $6$ times fewer model updates compared to re-training the DSI model for incrementally indexing five corpora in a sequence.
translated by 谷歌翻译